Providing User-Support in Performing Knowledge Discovery in Databases

Authors

  • Robert Engels
  • Michael Erdmann
  • Rainer Perkuhn
  • Rudi Studer
Abstract

Knowledge Management (KM) is becoming a success factor for industrial organisations. Obtaining control over data and extracting information from it helps an organisation achieve its goals more effectively. Thus knowledge (or information) becomes a very important resource that must be adequately procured, stored, processed and communicated. These tasks are central to Knowledge (and Information) Management, which embraces key issues such as knowledge acquisition, data warehouses, data mining, database management systems, knowledge representation, case-based reasoning, hypermedia, workflow management, and decision support systems. As one can see, besides the information systems aspect, AI methods and concepts play a significant role in Knowledge Management, especially in procuring (knowledge acquisition), storing (knowledge representation), and processing (e.g. data mining, decision support systems) knowledge. One important source of information and knowledge is the growing number of large databases that are built up and maintained in large organisations and, by now, also in many small and medium-sized enterprises (SMEs). As a consequence, the research and application area 'Knowledge Discovery in Databases' (KDD) (cf. [Frawley et al. 91], [Fayyad and Uthurusamy 95], [Simoudis et al. 96], [Fayyad et al. 96]) has gained importance in recent years. Successful KDD applications in marketing, financial investment, or network management indicate that developing KDD applications may yield strategic advantages in performing business tasks (cf. e.g. [Brachman et al. 96]). However, experience shows that developing (successful) KDD applications is a complex and error-prone process [Brachman and Anand 96].
In general, the KDD process consists of a task analysis step (for identifying the real application problem), a pre-processing step (for identifying and selecting relevant data), a data mining step, and a post-processing step (for evaluating the data mining results). So far, only first steps towards a comprehensive methodology that supports such complex and iterative KDD processes have been proposed (cf. e.g. [Wirth and Reinartz 96], [Engels 96]). On the other hand, there is a clear indication that such a methodology is needed, since:

  • especially in SMEs, KDD applications will be developed by application specialists and not by KDD specialists;
  • task analysis has to be supported in a systematic way in order to arrive at a KDD problem specification that really meets the application needs;
  • there are so many dependencies between different KDD algorithms (e.g. for pre-processing and data mining), as well as between data characteristics and applicable algorithms, that some kind of planning support is required to achieve a well-defined and consistent KDD process;
  • the KDD process is highly iterative, which means that results of later steps (e.g. evaluation) will have to be fed back to earlier steps (e.g. pre-processing).

Our approach for providing user support in performing Knowledge Discovery in Databases [Engels 96] aims at developing a methodology that supports the user in performing entire KDD processes and supports the reuse of previously successfully applied KDD processes. The methodology should provide a user guidance module that enables application specialists (not necessarily KDD experts) to develop a KDD process by adequately (re-)using successfully applied (parts of) KDD processes, data mining algorithms, and pre- and post-processing algorithms. Supporting such a reuse-oriented approach decreases the development time of new applications while achieving higher-quality solutions.
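The four-step, iterative structure described above can be illustrated with a minimal Python sketch. Only the step names come from the abstract; the `KDDRun` class, the `run_kdd` function, the `evaluate` callback, and the choice to feed back to pre-processing specifically are hypothetical assumptions made for illustration, not the authors' methodology.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class KDDRun:
    """Records which KDD steps were executed, in order."""
    trace: List[str] = field(default_factory=list)


def run_kdd(evaluate: Callable[[int], bool], max_iterations: int = 3) -> KDDRun:
    """Run the four KDD steps, looping back to pre-processing on rejection.

    `evaluate` is a hypothetical callback: it receives the iteration number
    and returns True when post-processing accepts the data mining result.
    """
    run = KDDRun()
    run.trace.append("task_analysis")  # assumed to happen once, up front
    for iteration in range(1, max_iterations + 1):
        run.trace.append("pre_processing")
        run.trace.append("data_mining")
        run.trace.append("post_processing")
        if evaluate(iteration):
            break  # result accepted, no further feedback needed
    return run


# Example: the mining result is only accepted on the second pass.
run = run_kdd(lambda iteration: iteration >= 2)
print(run.trace)
```

In this sketch the feedback loop is bounded by `max_iterations`; a real process would more likely let the user decide when to stop and which earlier step to revisit.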
The user guidance module includes a repository which contains previously applied KDD processes, data mining algorithms, and pre/post-processing algorithms. To be able to retrieve these processes and algorithms (e.g. by applying case-based reasoning methods) there must exist a uniform description that highlights the differences in the functionality of these processes and algorithms. On the other hand, the initial
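Retrieving stored processes through a uniform description can be sketched as a simple similarity lookup, in the spirit of case-based reasoning. This is an assumption-laden illustration: the attribute names (`task`, `data`, `size`), the `StoredProcess` class, and the matching-fraction similarity measure are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class StoredProcess:
    """A previously applied KDD process with a uniform attribute description."""
    name: str
    description: Dict[str, str]  # hypothetical attributes, e.g. task, data type


def similarity(query: Dict[str, str], description: Dict[str, str]) -> float:
    """Fraction of query attributes whose values match the stored description."""
    if not query:
        return 0.0
    matches = sum(1 for key, value in query.items() if description.get(key) == value)
    return matches / len(query)


def retrieve(query: Dict[str, str], repository: List[StoredProcess], top_k: int = 1) -> List[StoredProcess]:
    """Return the top_k stored processes most similar to the query."""
    ranked = sorted(repository, key=lambda p: similarity(query, p.description), reverse=True)
    return ranked[:top_k]


# Hypothetical repository contents, for illustration only.
repository = [
    StoredProcess("churn_tree", {"task": "classification", "data": "nominal", "size": "large"}),
    StoredProcess("sales_clusters", {"task": "clustering", "data": "numeric", "size": "large"}),
]
query = {"task": "classification", "data": "nominal"}
best = retrieve(query, repository)[0]
print(best.name)
```

The point of the uniform description is visible here: because every stored process is described with the same attributes, a new problem can be compared against all of them with a single measure.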


Similar resources

A data mining approach to employee turnover prediction (case study: Arak automotive parts manufacturing)

Training and adaption of employees are time and money consuming. Employees’ turnover can be predicted by their organizational and personal historical data in order to reduce probable loss of organizations. Prediction methods are highly related to human resource management to obtain patterns by historical data. This article implements knowledge discovery steps on real data of a manufacturing pla...


Analysis of User Interface Environment in Scientific Databases According to the Viewpoints of Postgraduate Students Applying Dervin's Sense-Making Theory

Abstract Background and purpose: The purpose of this study was to analyze the user interface environment of some databases (Science Direct, Springer, Clinical Key, and Wiley online library) from the perspective of users applying Dervin's sense-making theory. Materials and methods: A cross-sectional descriptive study was conducted in 100 PhD students and research-based PhD students in Mazandar...


A Methodology for Providing User Support for Developing Knowledge Discovery Applications

Knowledge Discovery in Databases (KDD) currently receives much attention from both the research as well as the industrial world. This might be due to the fact that companies are often faced with continuously growing databases that become increasingly important for decision making, whereas traditional data analysis approaches might not be able to handle all the requirements of decision makers. I...


Knowledge Discovery from Multiple Databases

Knowledge discovery systems for databases are employed to provide valuable insights into characteristics and relationships that may exist in the data, but are unknown to the user. This paper describes a methodology and system for performing knowledge discovery across multiple databases. These enhancements have been integrated into the prototype knowledge discovery system called INLEN. The enhan...


Application of Rough Set Theory in Data Mining for Decision Support Systems (DSSs)

Decision support systems (DSSs) are prevalent information systems for decision making in many competitive business environments. In a DSS, decision making process is intimately related to some factors which determine the quality of information systems and their related products. Traditional approaches to data analysis usually cannot be implemented in sophisticated Companies, where managers ne...




Publication date: 2002